LOOKING INTO THE EYE OF THE BITS

                 REVERSE ENGINEERING USING MEMORY ANALYSIS

                                      FEBRUARY, 2011
                                       ASSAF NATIV

                                        INTRODUCTION

During the past three years I've been developing tools for research and implementation of a
new type of software analysis, which I will introduce in this paper. This new type of reverse
engineering allows recovering internal implementation details using only passive memory
analysis, and without requiring any disassembly. I will also discuss how to cope with the
challenge that applications (including DBs) are always in a state of flux - new versions, security
updates, etc., keep changing the memory structure. I will answer the question of supporting a
new version of the target application without seeing it.
I will discuss the added value of this new method of internals' recovery over the more common
method of disassembling and decompiling. I will also share my stockpile of common memory
patterns, written in Python, and explain the vast information that can be uncovered simply by
roaming about in memory land.
In my talk I will give a demonstration that will include a description of a security problem that I
found in Microsoft SQL Server (published during 2009), as a result of applying this
methodology. I will demonstrate how it is possible to recover the internal structures of a program
as complex as a DBMS, and how one can find the important core internals that should be
protected.
One major application of this technique, discussed in this paper, is gaining the deep knowledge
and understanding of the internal building blocks and design of the target application that is
required to implement monitoring. As far as I know this method of memory monitoring has never before
been used for security purposes. This method allows us to achieve a good view of the
application’s activity, on the one hand, while on the other hand minimizing the performance
impact (in contrast to methods that require extensive application logging, for example). It
depends on the existence of caching, pipelining and buffering of data to create a real time view
of the application’s activity. When applied efficiently it can be used to protect applications from
various exploits and thus can be adopted as an alternative to applying security patches to
products, especially when applying the patches comes at a very high cost (e.g. extensive testing
of applications, shutting down mission-critical applications, etc.).
Reverse engineers may consider recovering internal implementations and data structures by
studying memory dumps difficult or not worth the hassle. In this paper you will see that not only
is this job less complex than one may think, but it can also be more effective than traditional
SRE. I will show the benefits of this work in many real-world examples. I will divide this problem
into four smaller subjects as follows:
      Examine the tools one needs for the task
      Analyze all of the different primitives we ought to find in memory
      Discuss a simple way to define at a high level the structures and patterns to search for in
         memory
      Case study.

                                            TOOLS

A lot has been said about SRE tools, and almost any debugger would be sufficient for our
needs. I find the Python interactive interpreter to be the most efficient environment for
carrying out research of this kind. As I research, the state of the interpreter holds my current
knowledge of the inspected target. Any piece of information can be easily accessed because it
is all stored in global variables. Thanks to these benefits and many more, one can “play” with
the data and try to make some sense of it. On Win32 there is the PyDbg module, which enables a
researcher to debug a process from a Python environment. An alternative to PyDbg is pyMint, a
tool I wrote for this task, which is freely available online.
The functionality one should look for in a debugger is:
    1. Displaying memory in various ways.
    2. Searching in memory in various ways.
    3. Gathering as much information about the memory as possible (e.g. page attributes,
        memory regions, heap structures and so on).
Memory dumps can be displayed in binary, Dword, ASCII, Unicode, graphical and other forms,
and it is best when all modes are accessible from one integrated environment. A simple change
in the way memory is shown can make the difference between random-looking bits and bytes and
a data structure with an apparent purpose. For example, here are two dumps of the same memory:

The first:
00   6C29    760A   6C29   760A   -   0100   0000   0000   0000   l)v.l)v.........
10   0100    0000   387B   C603   -   387B   C603   0000   0000   ....8{..8{......
20   0000    0000   0000   0000   -   4C7B   C603   4C7B   C603   ........L{..L{..
30   0000    0000   0000   0000   -   0000   0000   607B   C603   ............`{..
40   607B    C603   0000   0000   -   0000   0000   0000   0000   `{..............
50   747B    C603   747B   C603   -   0000   0000   0000   0000   t{..t{..........
60   0000    0000   887B   C603   -   887B   C603   0000   0000   .....{...{......
70   0000    0000   0000   0000   -   9C7B   C603   9C7B   C603   .........{...{..
80   0000    0000   0000   0000   -   0000   0000   B07B   C603   .............{..
90   B07B    C603   0000   0000   -   0000   0000   0000   0000   .{..............
A0   C47B    C603   C47B   C603   -   0000   0000   0000   0000   .{...{..........
B0   0000    0000   D87B   C603   -   D87B   C603   0000   0000   .....{...{......
C0   0000    0000   0000   0000   -   EC7B   C603   EC7B   C603   .........{...{..
D0   0000    0000   0000   0000   -   0000   0000   007C   C603   .............|..
E0   007C    C603   0000   0000   -   0000   0000   0000   0000   .|..............
F0   147C    C603   147C   C603   -   0000   0000   0000   0000   .|...|..........

And the second:
 0    A76296C        A76296C                   1              0                 1
14    3C67B38        3C67B38                   0              0                 0
28    3C67B4C        3C67B4C                   0              0                 0
3c    3C67B60        3C67B60                   0              0                 0
50    3C67B74        3C67B74                   0              0                 0
64    3C67B88        3C67B88                   0              0                 0
78    3C67B9C        3C67B9C                   0              0                 0
8c    3C67BB0        3C67BB0                   0              0                 0
a0    3C67BC4        3C67BC4                   0              0                 0
b4    3C67BD8        3C67BD8                   0              0                 0
c8    3C67BEC        3C67BEC                   0              0                 0
dc    3C67C00        3C67C00                   0              0                 0
f0    3C67C14        3C67C14                   0              0                 0

The first dump looks like a bunch of bytes that make no sense, while the second looks like a
table in which every entry starts with two pointers followed by 3 numbers. A good (and correct,
in this case) guess would be that this is an open hash table, where the first two Dwords are the
next / prev pointers of the linked list and the following number is the number of items in the
bucket.
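
The change of view between the two dumps is easy to reproduce. The following is a minimal
sketch in Python, not the pyMint implementation: it reinterprets a raw buffer as little-endian
Dwords and groups them five to a row, matching the 0x14-byte entries seen above. The function
names and the use of the first 0x28 bytes of the dump as sample data are illustrative
assumptions.

```python
import struct

def dword_rows(buf, dwords_per_row=5):
    """Split buf into rows of little-endian 32-bit unsigned values."""
    count = len(buf) // 4
    values = struct.unpack("<%dI" % count, buf[:count * 4])
    return [values[i:i + dwords_per_row]
            for i in range(0, count, dwords_per_row)]

def format_rows(rows, dwords_per_row=5):
    """Render each row as a hex offset followed by hex Dword columns."""
    lines = []
    for index, row in enumerate(rows):
        offset = index * dwords_per_row * 4
        lines.append("%4x  " % offset + " ".join("%10X" % v for v in row))
    return lines

# First 0x28 bytes of the dump shown above (two table entries).
SAMPLE = bytes.fromhex(
    "6C29760A6C29760A0100000000000000"
    "01000000387BC603387BC60300000000"
    "0000000000000000"
)
ROWS = dword_rows(SAMPLE)

if __name__ == "__main__":
    print("\n".join(format_rows(ROWS)))
```

Trying different values of `dwords_per_row` is a quick way to look for the stride at which the
columns line up into a recognizable structure.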
Another interesting way to inspect memory is graphical, and it was used in a tool called
Kartograph. This tool was created by Elie Bursztein to produce map hacks for strategy games.



                                      MEMORY IN DETAIL

In order to classify the primitives found in memory, I’ve divided them into four groups.
     1. Pointers
     2. Data
             a. Text
             b. Time stamps
             c. etc.
     3. Completely Random
     4. Code
Pointers tend to have the virtue of pointing to something in memory, which helps identify them.
Furthermore, the CPU handles Dword-aligned addresses better, so the compiler, heap or OS
tries to keep pointers aligned where possible. This means the last hex digit of most pointers is
0, 4, 8 or C.
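
These two properties can be combined into a simple filter. The sketch below is only an
illustration: `looks_like_pointer` and the `REGIONS` memory map are hypothetical names, the
region boundaries are made up, and treating zero as "not a pointer" is my assumption. In
practice the list of regions would come from the debugger's memory map.

```python
def looks_like_pointer(value, regions, alignment=4):
    """Return True if value is aligned and falls inside a known region.

    regions is a list of (start, end) address pairs, end exclusive.
    """
    # Assumption: NULL and unaligned values are rejected outright.
    if value == 0 or value % alignment != 0:
        return False
    return any(start <= value < end for start, end in regions)

# Hypothetical memory map; a real one would be queried from the target process.
REGIONS = [(0x03C60000, 0x03C70000), (0x0A760000, 0x0A770000)]

if __name__ == "__main__":
    for candidate in (0x03C67B38, 0x0A76296C, 0x00000001):
        print(hex(candidate), looks_like_pointer(candidate, REGIONS))
```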
“Data” is anything found in memory that has a meaning, such as IDs, handles, names, etc.
“Data” is identified simply by prior knowledge of what it means. For instance, if I know that my
session ID is 0x33, finding 0x33 in a memory array would guide me through the memory maze.
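
Searching for such a known value could look like the sketch below. It assumes the value is
stored as a little-endian Dword; `find_dword` and the `SESSIONS` sample buffer are
hypothetical and only stand in for memory read from a real target.

```python
import struct

def find_dword(buf, value, aligned=True):
    """Yield offsets in buf where the little-endian Dword equals value."""
    needle = struct.pack("<I", value)
    step = 4 if aligned else 1
    for offset in range(0, len(buf) - 3, step):
        if buf[offset:offset + 4] == needle:
            yield offset

# Made-up array of six Dwords in which the session ID 0x33 appears twice.
SESSIONS = struct.pack("<6I", 0x10, 0x2A, 0x33, 0x07, 0x33, 0x00)
HITS = list(find_dword(SESSIONS, 0x33))

if __name__ == "__main__":
    print([hex(offset) for offset in HITS])
```

Restricting the search to aligned offsets reduces false hits, at the cost of missing values
stored inside packed structures.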
Contrary to common belief, truly random numbers are hardly ever found in memory.
Furthermore, even memory that is not allocated at all and is not referred to by any code is not
filled with random data, but with whatever was in that memory the last time it was used. In fact,
when one encounters a buffer in memory that seems to be randomly generated, it usually
corresponds to encrypted data, compressed data, a hash digest or a buffer of pseudo-random
numbers, which is helpful when trying to recover some logic.
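
One common way to put a number on "seems to be randomly generated" is the Shannon entropy of
the byte values. The paper does not prescribe this measure; the sketch below is my own
illustration, and any threshold for calling a buffer random would have to be chosen per
target.

```python
import math
from collections import Counter

def shannon_entropy(buf):
    """Return the Shannon entropy of buf in bits per byte (0.0 to 8.0)."""
    if not buf:
        return 0.0
    total = len(buf)
    counts = Counter(buf)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

if __name__ == "__main__":
    print(shannon_entropy(b"\x00" * 64))        # constant buffer, lowest entropy
    print(shannon_entropy(bytes(range(256))))   # every byte value once, highest entropy
```

Small buffers give unreliable estimates, so this works best on regions of at least a few
hundred bytes.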
To identify code one should be familiar with some assembly encoding. Almost every kind of
CPU has its own signatures for function prologues / epilogues and common code. Most
debuggers do a good job of separating the code from the data, and for an exotic CPU a new
code-searching function could be written in a matter of hours. If we take, for example, x86 and
the Visual Studio compiler, we can see that almost every function ends with 0xc3 0x90 0x90
0x90 0x90, which is the RET opcode followed by four NOPs (used for the MS Detours library).
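
A byte-signature search of this kind takes only a few lines. The sketch below looks for the
RET + four NOPs sequence described above; `find_pattern` and the `CODE` sample buffer are
hypothetical, and the sample bytes were written by hand for illustration rather than taken
from a real binary.

```python
# RET followed by four NOPs, as described in the text.
EPILOGUE = bytes([0xC3, 0x90, 0x90, 0x90, 0x90])

def find_pattern(buf, pattern=EPILOGUE):
    """Return every offset at which pattern occurs in buf."""
    offsets = []
    start = buf.find(pattern)
    while start != -1:
        offsets.append(start)
        start = buf.find(pattern, start + 1)
    return offsets

# Two hand-written function bodies, each ending with the signature.
CODE = (b"\x55\x8b\xec\x5d" + EPILOGUE +
        b"\x55\x8b\xec\x33\xc0\x5d" + EPILOGUE)

if __name__ == "__main__":
    print(find_pattern(CODE))
```

The same function can be pointed at any other signature by passing a different `pattern`.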


                                         FUTURE WORK

Currently the implementation is not complete, and I’ve focused on the aspects that were
necessary for my work at Sentrigo. There is more to consider for future work:
      Adding features such as RegExp, faster memory scanning, better memory map query
       and more to the Candy / Mint Python modules. Although I didn’t use any of these in my
       projects, other people may find these kinds of features more essential.
      Writing an in-memory debugger / editor for the ActionScript VM (Flash). This kind of tool
       could make Flash debugging and developing much more effective.
      Creating a proof-of-concept web server monitor. I do believe that the well-proven
       security and monitoring method implemented by Sentrigo should be used for many
       other applications.
      Considering malware, it is interesting to check what kind of data a virus can harvest from
       a target by monitoring the memory and staying completely invisible to logs and monitors.
       On the other hand, anti-viruses could use some of the techniques described here to
       search for and locate viruses and make signatures for them.




                                           THANKS

      The Sensor team @ Sentrigo and the rest of Sentrigo for the time and effort and the
       great product of Hedgehog.
      Elie Bursztein for Kartograph.
      Roy Fox, Anna Trainin for proofing this paper.
      Anyone who contributes to the source.




                                             REFS

      My Python Win32 memory inspector module: http://code.google.com/p/pymint/
      Patterns constructing and searching Python module: http://code.google.com/p/pycandy/
      My lame blog: http://nativassaf.blogspot.com/
      Python interactive interpreter that I use: http://dreampie.sourceforge.net/
      Python Win32 debugger module: http://pedram.redhive.com/PyDbg/docs/
      Kartograph: http://elie.im/talks/kartograph (Also on Defcon 2010 website)
      Microsoft detours library: http://research.microsoft.com/en-us/projects/detours/




CONTACT DETAILS
Assaf Nativ
Tel-Aviv, Israel
+972-505237809
Nativ.Assaf@gmail.com

More Related Content

Viewers also liked

An analysis of a facebook spam exploited through browser add-ons - Whitepaper
An analysis of a facebook spam exploited through browser add-ons - WhitepaperAn analysis of a facebook spam exploited through browser add-ons - Whitepaper
An analysis of a facebook spam exploited through browser add-ons - Whitepapern|u - The Open Security Community
 
Web Application Finger Printing - Methods/Techniques and Prevention
Web Application Finger Printing - Methods/Techniques and PreventionWeb Application Finger Printing - Methods/Techniques and Prevention
Web Application Finger Printing - Methods/Techniques and Preventionn|u - The Open Security Community
 
nullcon 2011 - Automatic Program Analysis using Dynamic Binary Instrumentation
nullcon 2011 - Automatic Program Analysis using Dynamic Binary Instrumentationnullcon 2011 - Automatic Program Analysis using Dynamic Binary Instrumentation
nullcon 2011 - Automatic Program Analysis using Dynamic Binary Instrumentationn|u - The Open Security Community
 

Viewers also liked (16)

Security Issues in Android Custom Rom - Whitepaper
Security Issues in Android Custom Rom - WhitepaperSecurity Issues in Android Custom Rom - Whitepaper
Security Issues in Android Custom Rom - Whitepaper
 
An analysis of a facebook spam exploited through browser add-ons - Whitepaper
An analysis of a facebook spam exploited through browser add-ons - WhitepaperAn analysis of a facebook spam exploited through browser add-ons - Whitepaper
An analysis of a facebook spam exploited through browser add-ons - Whitepaper
 
Cracking Salted Hashes
Cracking Salted HashesCracking Salted Hashes
Cracking Salted Hashes
 
nullcon 2011 - Penetration Testing a Biometric System
nullcon 2011 - Penetration Testing a Biometric Systemnullcon 2011 - Penetration Testing a Biometric System
nullcon 2011 - Penetration Testing a Biometric System
 
Cracking CTFs - Sysbypass CTF Walkthrough
Cracking CTFs - Sysbypass CTF WalkthroughCracking CTFs - Sysbypass CTF Walkthrough
Cracking CTFs - Sysbypass CTF Walkthrough
 
Club hack 2011 precon ctf walkthrough
Club hack 2011 precon ctf walkthroughClub hack 2011 precon ctf walkthrough
Club hack 2011 precon ctf walkthrough
 
Web Application Finger Printing - Methods/Techniques and Prevention
Web Application Finger Printing - Methods/Techniques and PreventionWeb Application Finger Printing - Methods/Techniques and Prevention
Web Application Finger Printing - Methods/Techniques and Prevention
 
Phishing and being phished!
Phishing and being phished!Phishing and being phished!
Phishing and being phished!
 
nullcon 2011 - Automatic Program Analysis using Dynamic Binary Instrumentation
nullcon 2011 - Automatic Program Analysis using Dynamic Binary Instrumentationnullcon 2011 - Automatic Program Analysis using Dynamic Binary Instrumentation
nullcon 2011 - Automatic Program Analysis using Dynamic Binary Instrumentation
 
Project Jugaad
Project JugaadProject Jugaad
Project Jugaad
 
Humla workshop on Android Security Testing - null Singapore
Humla workshop on Android Security Testing - null SingaporeHumla workshop on Android Security Testing - null Singapore
Humla workshop on Android Security Testing - null Singapore
 
OSSIM Overview
OSSIM OverviewOSSIM Overview
OSSIM Overview
 
Legiment Techniques of IDS/IPS Evasion
Legiment Techniques of IDS/IPS EvasionLegiment Techniques of IDS/IPS Evasion
Legiment Techniques of IDS/IPS Evasion
 
Identifying XSS Vulnerabilities
Identifying XSS VulnerabilitiesIdentifying XSS Vulnerabilities
Identifying XSS Vulnerabilities
 
Attacking VPN's
Attacking VPN'sAttacking VPN's
Attacking VPN's
 
Newbytes NullHyd
Newbytes NullHydNewbytes NullHyd
Newbytes NullHyd
 

Similar to nullcon 2011 - Memory analysis – Looking into the eye of the bits

Interview with Anatoliy Kuznetsov, the author of BitMagic C++ library
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ libraryInterview with Anatoliy Kuznetsov, the author of BitMagic C++ library
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ libraryPVS-Studio
 
Back To The Future.Key 2
Back To The Future.Key 2Back To The Future.Key 2
Back To The Future.Key 2gueste8cc560
 
APEX Connect 2019 - array/bulk processing in PLSQL
APEX Connect 2019 - array/bulk processing in PLSQLAPEX Connect 2019 - array/bulk processing in PLSQL
APEX Connect 2019 - array/bulk processing in PLSQLConnor McDonald
 
Data oriented design and c++
Data oriented design and c++Data oriented design and c++
Data oriented design and c++Mike Acton
 
Server-Side Development for the Cloud
Server-Side Developmentfor the CloudServer-Side Developmentfor the Cloud
Server-Side Development for the CloudMichael Rosenblum
 
Patterns for organic architecture codedive
Patterns for organic architecture codedivePatterns for organic architecture codedive
Patterns for organic architecture codedivemagda3695
 
Errors detected in C++Builder
Errors detected in C++BuilderErrors detected in C++Builder
Errors detected in C++BuilderPVS-Studio
 
Key-Value Stores: a practical overview
Key-Value Stores: a practical overviewKey-Value Stores: a practical overview
Key-Value Stores: a practical overviewMarc Seeger
 
Monitoring a program that monitors computer networks
Monitoring a program that monitors computer networksMonitoring a program that monitors computer networks
Monitoring a program that monitors computer networksAndrey Karpov
 
Interpreting the data parallel analysis with sawzall
Interpreting the data  parallel analysis with sawzallInterpreting the data  parallel analysis with sawzall
Interpreting the data parallel analysis with sawzallLee David
 
Robotics Toolbox for MATLAB (Relese 9)
Robotics Toolbox for MATLAB (Relese 9)Robotics Toolbox for MATLAB (Relese 9)
Robotics Toolbox for MATLAB (Relese 9)CHIH-PEI WEN
 
100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects 100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects Andrey Karpov
 
Super scaling singleton inserts
Super scaling singleton insertsSuper scaling singleton inserts
Super scaling singleton insertsChris Adkin
 
Vision Algorithmics
Vision AlgorithmicsVision Algorithmics
Vision Algorithmicspotaters
 
PVS-Studio delved into the FreeBSD kernel
PVS-Studio delved into the FreeBSD kernelPVS-Studio delved into the FreeBSD kernel
PVS-Studio delved into the FreeBSD kernelPVS-Studio
 
MeetBSD2014 Performance Analysis
MeetBSD2014 Performance AnalysisMeetBSD2014 Performance Analysis
MeetBSD2014 Performance AnalysisBrendan Gregg
 

Similar to nullcon 2011 - Memory analysis – Looking into the eye of the bits (20)

Interview with Anatoliy Kuznetsov, the author of BitMagic C++ library
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ libraryInterview with Anatoliy Kuznetsov, the author of BitMagic C++ library
Interview with Anatoliy Kuznetsov, the author of BitMagic C++ library
 
No more dumb hex!
No more dumb hex!No more dumb hex!
No more dumb hex!
 
Back To The Future.Key 2
Back To The Future.Key 2Back To The Future.Key 2
Back To The Future.Key 2
 
Mach-O par Stéphane Sudre
Mach-O par Stéphane SudreMach-O par Stéphane Sudre
Mach-O par Stéphane Sudre
 
APEX Connect 2019 - array/bulk processing in PLSQL
APEX Connect 2019 - array/bulk processing in PLSQLAPEX Connect 2019 - array/bulk processing in PLSQL
APEX Connect 2019 - array/bulk processing in PLSQL
 
Data oriented design and c++
Data oriented design and c++Data oriented design and c++
Data oriented design and c++
 
Server-Side Development for the Cloud
Server-Side Developmentfor the CloudServer-Side Developmentfor the Cloud
Server-Side Development for the Cloud
 
Patterns for organic architecture codedive
Patterns for organic architecture codedivePatterns for organic architecture codedive
Patterns for organic architecture codedive
 
Errors detected in C++Builder
Errors detected in C++BuilderErrors detected in C++Builder
Errors detected in C++Builder
 
Key-Value Stores: a practical overview
Key-Value Stores: a practical overviewKey-Value Stores: a practical overview
Key-Value Stores: a practical overview
 
Monitoring a program that monitors computer networks
Monitoring a program that monitors computer networksMonitoring a program that monitors computer networks
Monitoring a program that monitors computer networks
 
Ijnsa050206
Ijnsa050206Ijnsa050206
Ijnsa050206
 
Interpreting the data parallel analysis with sawzall
Interpreting the data  parallel analysis with sawzallInterpreting the data  parallel analysis with sawzall
Interpreting the data parallel analysis with sawzall
 
Robotics Toolbox for MATLAB (Relese 9)
Robotics Toolbox for MATLAB (Relese 9)Robotics Toolbox for MATLAB (Relese 9)
Robotics Toolbox for MATLAB (Relese 9)
 
100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects 100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects
 
Super scaling singleton inserts
Super scaling singleton insertsSuper scaling singleton inserts
Super scaling singleton inserts
 
Vision Algorithmics
Vision AlgorithmicsVision Algorithmics
Vision Algorithmics
 
PVS-Studio delved into the FreeBSD kernel
PVS-Studio delved into the FreeBSD kernelPVS-Studio delved into the FreeBSD kernel
PVS-Studio delved into the FreeBSD kernel
 
MeetBSD2014 Performance Analysis
MeetBSD2014 Performance AnalysisMeetBSD2014 Performance Analysis
MeetBSD2014 Performance Analysis
 
Final Document
Final DocumentFinal Document
Final Document
 

More from n|u - The Open Security Community

Gibson 101 -quick_introduction_to_hacking_mainframes_in_2020_null_infosec_gir...
Gibson 101 -quick_introduction_to_hacking_mainframes_in_2020_null_infosec_gir...Gibson 101 -quick_introduction_to_hacking_mainframes_in_2020_null_infosec_gir...
Gibson 101 -quick_introduction_to_hacking_mainframes_in_2020_null_infosec_gir...n|u - The Open Security Community
 

More from n|u - The Open Security Community (20)

Hardware security testing 101 (Null - Delhi Chapter)
Hardware security testing 101 (Null - Delhi Chapter)Hardware security testing 101 (Null - Delhi Chapter)
Hardware security testing 101 (Null - Delhi Chapter)
 
Osint primer
Osint primerOsint primer
Osint primer
 
SSRF exploit the trust relationship
SSRF exploit the trust relationshipSSRF exploit the trust relationship
SSRF exploit the trust relationship
 
Nmap basics
Nmap basicsNmap basics
Nmap basics
 
Metasploit primary
Metasploit primaryMetasploit primary
Metasploit primary
 
Api security-testing
Api security-testingApi security-testing
Api security-testing
 
Introduction to TLS 1.3
Introduction to TLS 1.3Introduction to TLS 1.3
Introduction to TLS 1.3
 
Gibson 101 -quick_introduction_to_hacking_mainframes_in_2020_null_infosec_gir...
Gibson 101 -quick_introduction_to_hacking_mainframes_in_2020_null_infosec_gir...Gibson 101 -quick_introduction_to_hacking_mainframes_in_2020_null_infosec_gir...
Gibson 101 -quick_introduction_to_hacking_mainframes_in_2020_null_infosec_gir...
 
Talking About SSRF,CRLF
Talking About SSRF,CRLFTalking About SSRF,CRLF
Talking About SSRF,CRLF
 
Building active directory lab for red teaming
Building active directory lab for red teamingBuilding active directory lab for red teaming
Building active directory lab for red teaming
 
Owning a company through their logs
Owning a company through their logsOwning a company through their logs
Owning a company through their logs
 
Introduction to shodan
Introduction to shodanIntroduction to shodan
Introduction to shodan
 
Cloud security
Cloud security Cloud security
Cloud security
 
Detecting persistence in windows
Detecting persistence in windowsDetecting persistence in windows
Detecting persistence in windows
 
Frida - Objection Tool Usage
Frida - Objection Tool UsageFrida - Objection Tool Usage
Frida - Objection Tool Usage
 
OSQuery - Monitoring System Process
OSQuery - Monitoring System ProcessOSQuery - Monitoring System Process
OSQuery - Monitoring System Process
 
DevSecOps Jenkins Pipeline -Security
DevSecOps Jenkins Pipeline -SecurityDevSecOps Jenkins Pipeline -Security
DevSecOps Jenkins Pipeline -Security
 
Extensible markup language attacks
Extensible markup language attacksExtensible markup language attacks
Extensible markup language attacks
 
Linux for hackers
Linux for hackersLinux for hackers
Linux for hackers
 
Android Pentesting
Android PentestingAndroid Pentesting
Android Pentesting
 

Recently uploaded

My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsAndrey Dotsenko
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Neo4j
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
costume and set research powerpoint presentation
costume and set research powerpoint presentationcostume and set research powerpoint presentation
costume and set research powerpoint presentationphoebematthew05
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 

Recently uploaded (20)

The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024Build your next Gen AI Breakthrough - April 2024
Build your next Gen AI Breakthrough - April 2024
 
My Hashitalk Indonesia April 2024 Presentation
nullcon 2011 - Memory analysis – Looking into the eye of the bits

LOOKING INTO THE EYE OF THE BITS

REVERSE ENGINEERING USING MEMORY ANALYSIS

FEBRUARY, 2011
ASSAF NATIV

INTRODUCTION

During the past three years I've been developing tools for research and implementation of a new type of software analysis, which I will introduce in this paper. This new type of reverse engineering allows recovering internal implementation details using only passive memory analysis, and without requiring any disassembly. I will also discuss how to cope with the challenge that applications (including DBs) are always in a state of flux: new versions, security updates, etc., keep changing the memory structure. I will answer the question of supporting a new version of the target application without seeing it.

I will discuss the added value of this new method of internals' recovery over the more common method of disassembling and decompiling. I will also share my stockpile of common memory patterns, written in Python, and explain the vast information that can be uncovered simply by roaming about in memory land.

In my talk I will give a demonstration that will include a description of a security problem that I found in Microsoft SQL Server (published during 2009), as a result of applying this methodology. I will demonstrate how it is possible to recover the internal structures of a program as complex as a DBMS, and how one can find the important core internals that should be protected.

One major application of this technique is discussed: gaining the deep knowledge and understanding of the internal building blocks and design of the target application that is required to implement monitoring. As far as I know this method of memory monitoring has never before been used for security purposes. This method allows us to achieve a good view of the application’s activity, on the one hand, while on the other hand minimizing the performance impact (in contrast to methods that require extensive application logging, for example). The method depends on the existence of caching, pipelining and buffering of data to create a real-time view of the application’s activity. When applied efficiently it can be used to protect applications from various exploits and thus can be adopted as an alternative to applying security patches to products, especially when applying the patches comes at a very high cost (e.g. extensive testing of applications, shutting down mission-critical applications, etc.).

Reverse engineers may consider recovering internal implementations and data structures by studying memory dumps difficult or not worth the hassle. In this paper you will see that not only is this job not as complex as one may think, but it can also be more effective than traditional SRE. I will show the benefits of this work in many real-world examples.

I will divide this problem into four smaller subjects, as follows:

• Examine the tools one needs for the task
• Analyze all of the different primitives we ought to find in memory
• Discuss a simple way to define at a high level the structures and patterns to search for in memory
• Case study
TOOLS

A lot has been said about the subject of SRE tools, and almost any debugger would be sufficient for our needs. I find the Python interactive interpreter to be the most efficient environment for carrying out research of this kind. As I research, the current state of the interpreter holds my current knowledge of the inspected target. Any piece of information can be easily accessed because it is all stored in global variables. Thanks to these benefits and many more, one can “play” with the data and try to make some sense of it.

On Win32 there is the PyDBG module that enables a researcher to debug a process from a Python environment. An alternative to PyDBG is a tool I wrote for the task called pyMint, which is freely available online.

The functionality one should look for in the debugger in use is:

1. Displaying memory in various ways.
2. Searching in memory in various ways.
3. Gathering as much information about the memory as possible (e.g. page attributes, memory regions, heap structures and so on).

Memory dumps can be displayed in binary form, Dword form, ASCII, Unicode, graphically and more, and it’s better when all modes are accessible from one integrated environment. A simple modification of the way the memory is shown can make the difference between random-looking bits and bytes and a data structure with an apparent purpose. For example, here are two dumps of the same memory.

The first:

    00 6C29 760A 6C29 760A - 0100 0000 0000 0000 l)v.l)v.........
    10 0100 0000 387B C603 - 387B C603 0000 0000 ....8{..8{......
    20 0000 0000 0000 0000 - 4C7B C603 4C7B C603 ........L{..L{..
    30 0000 0000 0000 0000 - 0000 0000 607B C603 ............`{..
    40 607B C603 0000 0000 - 0000 0000 0000 0000 `{..............
    50 747B C603 747B C603 - 0000 0000 0000 0000 t{..t{..........
    60 0000 0000 887B C603 - 887B C603 0000 0000 .....{...{......
    70 0000 0000 0000 0000 - 9C7B C603 9C7B C603 .........{...{..
    80 0000 0000 0000 0000 - 0000 0000 B07B C603 .............{..
    90 B07B C603 0000 0000 - 0000 0000 0000 0000 .{..............
    A0 C47B C603 C47B C603 - 0000 0000 0000 0000 .{...{..........
    B0 0000 0000 D87B C603 - D87B C603 0000 0000 .....{...{......
    C0 0000 0000 0000 0000 - EC7B C603 EC7B C603 .........{...{..
    D0 0000 0000 0000 0000 - 0000 0000 007C C603 .............|..
    E0 007C C603 0000 0000 - 0000 0000 0000 0000 .|..............
    F0 147C C603 147C C603 - 0000 0000 0000 0000 .|...|..........

And the second:

    0   A76296C  A76296C  1  0  1
    14  3C67B38  3C67B38  0  0  0
    28  3C67B4C  3C67B4C  0  0  0
    3c  3C67B60  3C67B60  0  0  0
    50  3C67B74  3C67B74  0  0  0
    64  3C67B88  3C67B88  0  0  0
    78  3C67B9C  3C67B9C  0  0  0
    8c  3C67BB0  3C67BB0  0  0  0
    a0  3C67BC4  3C67BC4  0  0  0
    b4  3C67BD8  3C67BD8  0  0  0
    c8  3C67BEC  3C67BEC  0  0  0
    dc  3C67C00  3C67C00  0  0  0
    f0  3C67C14  3C67C14  0  0  0

The first dump looks like a bunch of bytes that make no sense, while the second looks like a table in which every entry starts with two pointers followed by three numbers. A good (and correct, in this case) guess would be that this is an open hash table, where the first two Dwords are the next / prev pointers of the linked list and the following number is the number of items in the bucket.

Another interesting way to inspect memory is graphical, as used in a tool called Kartograph. This tool was created by Elie Bursztein to produce map hacks for strategy games.

MEMORY IN DETAIL

In order to classify the primitives found in memory, I’ve divided them into four groups:

1. Pointers
2. Data
   a. Text
   b. Time stamps
   c. etc.
3. Completely random
4. Code

Pointers tend to have the virtue of pointing to something in memory, which helps identify them. Furthermore, the CPU handles Dword-aligned addresses better, which makes the compiler, heap or OS try to keep pointers aligned if possible. This means that, written in hexadecimal, most pointers end in 0, 4, 8 or C.

“Data” is anything found in memory that has a meaning, such as IDs, handles, names, etc. “Data” is simply identified by prior knowledge of what it means. For instance, if I know that my session ID is 0x33, finding 0x33 in a memory array would guide me through the memory maze.

Contrary to common belief, truly random numbers are hardly ever found in memory. Furthermore, even memory that is not allocated at all and is not referred to by any code is not filled with random data, but with whatever was in that memory the last time it was used. In fact, when one encounters a buffer in memory that seems to be randomly generated, it usually corresponds to encrypted data, compressed data, a hash digest or a buffer of pseudo-random numbers, which is helpful when trying to recover some logic.

To identify code one should be familiar with some assembly encoding.
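The Dword view and the pointer and randomness heuristics described above can be sketched in a few lines of Python. This is a minimal illustration and not the pyMint or PyDBG API: the helper names (`dword_rows`, `looks_like_pointer`, `shannon_entropy`) and the `REGIONS` address ranges are assumptions made up for the example, and the input is the first 0x28 bytes of the byte-oriented dump shown earlier, copied by hand.

```python
import math
import struct
from collections import Counter

# First 0x28 bytes of the byte-oriented dump from the TOOLS section.
RAW = bytes.fromhex(
    "6C29760A6C29760A0100000000000000"
    "01000000387BC603387BC60300000000"
    "0000000000000000"
)

# Hypothetical address ranges, assumed for illustration only. A real tool
# would ask the debugger for the target's committed memory regions.
REGIONS = [(0x03C60000, 0x03C70000), (0x0A760000, 0x0A770000)]


def dwords(buf):
    """Unpack a buffer into little-endian 32-bit values (x86 byte order)."""
    count = len(buf) // 4
    return list(struct.unpack("<%dI" % count, buf[:count * 4]))


def dword_rows(buf, per_row=5):
    """Group the Dwords into rows of per_row values, keyed by byte offset."""
    values = dwords(buf)
    return [(i * 4, values[i:i + per_row])
            for i in range(0, len(values), per_row)]


def looks_like_pointer(value, regions):
    """Heuristic: Dword aligned and falling inside a known memory region."""
    return value % 4 == 0 and any(lo <= value < hi for lo, hi in regions)


def shannon_entropy(buf):
    """Shannon entropy of the byte values in buf, in bits per byte."""
    if not buf:
        return 0.0
    total = len(buf)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(buf).values())


# Print the table view; a trailing "*" marks values that pass the
# pointer heuristic.
for offset, row in dword_rows(RAW):
    cells = ["%X%s" % (v, "*" if looks_like_pointer(v, REGIONS) else "")
             for v in row]
    print("%-3x %s" % (offset, " ".join(cells)))

print("entropy: %.2f bits/byte" % shannon_entropy(RAW))
```

The row width of five Dwords is itself a guess that has to be tried by hand: it matches the 0x14-byte stride of the second dump. A byte entropy close to 8 bits on a reasonably large buffer hints at encrypted or compressed content, while structured data such as this table scores much lower.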
Almost every kind of CPU has its own signatures for function prologues / epilogues and common code. Most debuggers do a good job of separating the code from the data, and for an exotic CPU a new code-searching function could be written in a matter of hours. If we take, for example, x86 and the Visual Studio compiler, we can see that almost every function ends with 0xc3 0x90 0x90 0x90 0x90, which is the RET opcode followed by four NOPs (used for the MS detours library).

FUTURE WORK

Currently the implementation is not complete, and I’ve focused on the aspects that were necessary for my work at Sentrigo. There is also more to consider for future work:
• Adding features such as RegExp, faster memory scanning, better memory map queries and more to the Candy / Mint Python modules. Although I didn’t use any of these in my projects, other people may find these kinds of features more essential.
• Writing an in-memory debugger / editor for the ActionScript VM (Flash). This kind of tool could make Flash debugging and development much more effective.
• Creating a proof-of-concept web server monitor. I do believe that the well-proven security and monitoring method implemented by Sentrigo should be used for many other applications.
• Considering malware, it is interesting to check what kind of data a virus can harvest from a target by monitoring the memory while staying completely invisible to logs and monitors. On the other hand, anti-virus products could use some of the techniques described here to search for and locate viruses and make signatures for them.

THANKS

• The Sensor team @ Sentrigo and the rest of Sentrigo for the time and effort and the great product of Hedgehog.
• Elie Bursztein for Kartograph.
• Roy Fox, Anna Trainin for proofing this paper.
• Anyone who contributes to the source.

REFS

• My Python Win32 memory inspector module: http://code.google.com/p/pymint/
• Patterns constructing and searching Python module: http://code.google.com/p/pycandy/
• My lame blog: http://nativassaf.blogspot.com/
• Python interactive interpreter that I use: http://dreampie.sourceforge.net/
• Python Win32 debugger module: http://pedram.redhive.com/PyDbg/docs/
• Kartograph: http://elie.im/talks/kartograph (also on the Defcon 2010 website)
• Microsoft detours library: http://research.microsoft.com/en-us/projects/detours/

CONTACT DETAILS

Assaf Nativ
Tel-Aviv, Israel
+972-505237809
Nativ.Assaf@gmail.com